Inhambu: Data Mining Using Idle Cycles in Clusters of PCs

Authors

  • Hermes Senger
  • Eduardo R. Hruschka
  • Fabrício Alves Barbosa da Silva
  • Liria Matsumoto Sato
  • Calebe De Paula Bianchini
  • Marcelo D. Esperidião

Abstract

In this paper we present and evaluate Inhambu, a distributed object-oriented system that relies on dynamic monitoring to collect information about the availability of computational resources, providing the necessary support for the execution of data mining applications on clusters of PCs and workstations. We also describe a modified implementation of the data mining tool Weka, which executes the cross-validation procedure in parallel with the support of Inhambu. We present preliminary tests showing that performance gains can be obtained for computationally expensive data mining algorithms, even when running with small datasets.
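
The idea of running cross-validation in parallel can be illustrated with a minimal sketch. This is not Inhambu's or Weka's code: the function names (`make_folds`, `evaluate_fold`, `parallel_cross_validation`), the toy 1-nearest-neighbour classifier, and the use of a local thread pool in place of idle cluster nodes are all assumptions made for illustration. The point is only that each fold is an independent task, so folds can be dispatched to whatever workers are available and the per-fold results combined afterwards.

```python
from concurrent.futures import ThreadPoolExecutor


def make_folds(n_samples, k):
    """Split indices 0..n_samples-1 into k disjoint, nearly equal folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def evaluate_fold(data, labels, test_idx):
    """Train on all samples outside test_idx and count correct predictions
    on test_idx. The 'model' is a toy 1-nearest-neighbour rule on 1-D
    features (a hypothetical stand-in for an expensive learning algorithm)."""
    held_out = set(test_idx)
    train_idx = [i for i in range(len(data)) if i not in held_out]
    correct = 0
    for i in test_idx:
        nearest = min(train_idx, key=lambda j: abs(data[j] - data[i]))
        if labels[nearest] == labels[i]:
            correct += 1
    return correct


def parallel_cross_validation(data, labels, k=5, workers=4):
    """Evaluate the k folds concurrently and return overall accuracy.
    The thread pool stands in for remote workers; in a cluster setting
    each fold would be shipped to a different (idle) machine."""
    folds = make_folds(len(data), k)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_fold = list(pool.map(lambda f: evaluate_fold(data, labels, f), folds))
    return sum(per_fold) / len(data)


# Two well-separated classes on a line.
data = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
labels = [0] * 5 + [1] * 5
accuracy = parallel_cross_validation(data, labels, k=5)
```

Because the folds share no state, the achievable speed-up is bounded by the number of folds and by the slowest worker, which is one reason a scheduler that tracks resource availability matters in a non-dedicated cluster. Note that CPython threads do not run CPU-bound work in parallel; the pool here only shows the task structure.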

Similar resources

Exploiting idle cycles to execute data mining applications on clusters of PCs

In this paper we present and evaluate Inhambu, a distributed object-oriented system that supports the execution of data mining applications on clusters of PCs and workstations. This system provides a resource management layer, built on top of Java/RMI, that supports the execution of the data mining tool Weka. We evaluate the performance of Inhambu by means of several experiments in h...

An Idle Compute Cycle Prediction Service for Computational Grids

The utilization of idle compute cycles has long been regarded as the most promising and cost-effective way to build a large-scale high-performance computing system, but it is not widely used because of the lack of effective idleness prediction techniques. In this paper, we argue that PCs at university computer labs have great potential for the utilization of idle CPU cycles, and propose two techniques for predictin...

A commodity platform for Distributed Data Mining - the HARVARD System

Systems performing Data Mining analysis are usually dedicated and expensive. They often require special purpose machines to run the data analysis tool. In this paper we propose an architecture for distributed Data Mining running on general purpose desktop computers. The proposed architecture was deployed in the HARVesting Architecture of idle machines foR Data mining (HARVARD) system. The Harva...

An Analysis of Idle CPU Cycles at University Computer Labs

Grid computing has great potential for grand-challenge scientific problems such as Molecular Simulation, High Energy Physics and Genome Informatics. Exploiting under-utilized resources is crucial for a cost-effective, large-scale grid computing platform (i.e., a computational grid), but there has been little research on how to predict which resources will be under-loaded in the near future....

Volunteer Computing on Clusters

Clusters typically represent a homogeneous, well maintained pool of high-end computation resources. This makes them particularly attractive for volunteer computing, where unused compute cycles are utilized for scientific guest applications. Cluster nodes are not idle as often as public PCs, but they are frequently underutilized while actively executing parallel applications. Hence, fully exploi...

Publication date: 2004